Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Low precision types support in Convert operation #5640

Merged
merged 15 commits into from
May 21, 2021

Conversation

jdanieck
Copy link
Contributor

@jdanieck jdanieck commented May 14, 2021

Details:

  • Add u1 type support + tests
  • Add u4 type support + tests
  • Add i4 type support + tests
  • Extended Serialization SLT with LP types
  • Added Convert to summarize.py report

Tickets:

  • 46096

@jdanieck jdanieck added the category: Core OpenVINO Core (aka ngraph) label May 14, 2021
@jdanieck jdanieck self-assigned this May 14, 2021
@jdanieck jdanieck force-pushed the convert_lp_types branch from 9d08366 to daebf97 Compare May 14, 2021 16:26
@jdanieck jdanieck marked this pull request as ready for review May 20, 2021 06:23
@jdanieck jdanieck requested review from a team, ilyachur, pelszkow and jane-intel and removed request for a team May 20, 2021 06:23
@jdanieck jdanieck added this to the 2021.4 milestone May 20, 2021
@jdanieck
Copy link
Contributor Author

jdanieck commented May 20, 2021

@ilyachur @jane-intel @pelszkow guys I think we should introduce general support of LP types to ngraph. We should consider introduction of ngraph::u1 type & vector<ngraph::u1> (an the same for u4 & i4) where we could encapsulate all this "binary magic". Also we will be able to dispatch templates against this type and avoid "if(LP_TYPE)" tricks like in this PR. Please notice that I needed to repeat LP types handling code from Constant here, and who knows how many times we'll need it in the future for other operations. What do you think ?

@jdanieck jdanieck changed the title Low precision types support in Convert reference implementation Low precision types support in Convert operation May 20, 2021
@ilyachur
Copy link
Contributor

@ilyachur @jane-intel @pelszkow guys I think we should introduce general support of LP types to ngraph. We should consider introduction of ngraph::u1 type & vector<ngraph::u1> (an the same for u4 & i4) where we could encapsulate all this "binary magic". Also we will be able to dispatch templates against this type and avoid "if(LP_TYPE)" tricks like in this PR. Please notice that I needed to repeat LP types handling code from Constant here, and who knows how many times we'll need it in the future for other operations. What do you think ?

The LP types are introduced in order to minimize the size of weights. If we want to work with vector of u1 or other LP types it means what vector will contain holes (for u1 the size of each element in vector is 1 byte not 1 bit), it means that we will increase the memory consumption.

@jdanieck
Copy link
Contributor Author

jdanieck commented May 20, 2021

@ilyachur @jane-intel @pelszkow guys I think we should introduce general support of LP types to ngraph. We should consider introduction of ngraph::u1 type & vector<ngraph::u1> (an the same for u4 & i4) where we could encapsulate all this "binary magic". Also we will be able to dispatch templates against this type and avoid "if(LP_TYPE)" tricks like in this PR. Please notice that I needed to repeat LP types handling code from Constant here, and who knows how many times we'll need it in the future for other operations. What do you think ?

The LP types are introduced in order to minimize the size of weights. If we want to work with vector of u1 or other LP types it means what vector will contain holes (for u1 the size of each element in vector is 1 byte not 1 bit), it means that we will increase the memory consumption.

I was thinking about specialization like with vector of bool which would allow us to keep the same memory usage.

My main point is that we should have some common place for code handling LP types in ngraph, implementation details we can discuss when we agree that it's needed. If you want to think about it on concreate use case you can look at TestCase class and think how to properly introduce LP types handling there (at this point they are not handled).

@ilyachur
Copy link
Contributor

ilyachur commented May 20, 2021

I was thinking about specialization like with vector of bool which would allow us to keep the same memory usage.

My main point is that we should have some common place for code handling LP types in ngraph, implementation details we can discuss when we agree that it's needed. If you want to think about it on concreate use case you can look at TestCase class and think how to properly introduce LP types handling there (at this point they are not handled).

Yeah, I know about the vector of bool, but unfortunately we also have u4, i4 lp types. And for such types we don't have a good solution

@jdanieck
Copy link
Contributor Author

I was thinking about specialization like with vector of bool which would allow us to keep the same memory usage.
My main point is that we should have some common place for code handling LP types in ngraph, implementation details we can discuss when we agree that it's needed. If you want to think about it on concreate use case you can look at TestCase class and think how to properly introduce LP types handling there (at this point they are not handled).

Yeah, I know about the vector of bool, but unfortunately we also have u4, i4 lp types. And for such types we don't have a good solution

I am really bad in communicating today :) I think it is technically possible to implement our own specialization of vector, e.g. vector<ngraph::u4> & vector<ngraph::i4> and follow the same implementation concept as in vector<bool> to decrease memory usage. @pelszkow could you confirm it is technically possible? I think we should at least consider it if is technically possible, and if we conclude it is too much work then we can reject.

@pelszkow
Copy link
Contributor

It looks like we should provide some tool which will be some mix of std::vector<bool> and std::span. This class should allow to iterate over the buffer/memory and access to LP like std::vector<bool> does with bits/bools.
e.g.:

template<int BitsNo>
struct Span{
    class Iterator;
    BitWrapper <BitsNo> operator[](size_t i);
    Iterator begin();
    Iterator end();
    //....
};

template<int BitsNo>
struct SpanIterator{
    BitWrapper <BitsNo> operator*();
    operator++();
    //....
}

template<int BitsNo>
struct BitWrapper {
    BitWrapper (uint8_t* mem, int index);
    operator uint8_t () const {/*do bit magic */}
    BitWrapper operator=(uint8_t value) {/*do bit magic */}
    //....
}

@ilyachur
Copy link
Contributor

I was thinking about specialization like with vector of bool which would allow us to keep the same memory usage.
My main point is that we should have some common place for code handling LP types in ngraph, implementation details we can discuss when we agree that it's needed. If you want to think about it on concreate use case you can look at TestCase class and think how to properly introduce LP types handling there (at this point they are not handled).

Yeah, I know about the vector of bool, but unfortunately we also have u4, i4 lp types. And for such types we don't have a good solution

I am really bad in communicating today :) I think it is technically possible to implement our own specialization of vector, e.g. vector<ngraph::u4> & vector<ngraph::i4> and follow the same implementation concept as in vector<bool> to decrease memory usage. @pelszkow could you confirm it is technically possible? I think we should at least consider it if is technically possible, and if we conclude it is too much work then we can reject.

Oh, ok, now I got your point. So in this case, I think we need to try to implement it.

@jdanieck
Copy link
Contributor Author

I was thinking about specialization like with vector of bool which would allow us to keep the same memory usage.
My main point is that we should have some common place for code handling LP types in ngraph, implementation details we can discuss when we agree that it's needed. If you want to think about it on concreate use case you can look at TestCase class and think how to properly introduce LP types handling there (at this point they are not handled).

Yeah, I know about the vector of bool, but unfortunately we also have u4, i4 lp types. And for such types we don't have a good solution

I am really bad in communicating today :) I think it is technically possible to implement our own specialization of vector, e.g. vector<ngraph::u4> & vector<ngraph::i4> and follow the same implementation concept as in vector<bool> to decrease memory usage. @pelszkow could you confirm it is technically possible? I think we should at least consider it if is technically possible, and if we conclude it is too much work then we can reject.

Oh, ok, now I got your point. So in this case, I think we need to try to implement it.

Ok, so just to be on the same page, I'll create separate ticket for this. Let's merge this PR as is, including your comments ofz.

@jdanieck
Copy link
Contributor Author

@ilyachur any chance you will review today? I'd love to merge it before FF :)

@ilyachur ilyachur merged commit 9e46be7 into openvinotoolkit:master May 21, 2021
@ilyachur
Copy link
Contributor

@jdanieck I merged this PR, can we add more tests in the next PR?

@jdanieck jdanieck deleted the convert_lp_types branch May 21, 2021 10:33
yekruglov pushed a commit to yekruglov/openvino that referenced this pull request Jun 7, 2021
* Add initial version of u1 type support.

* Turn off u8_to_u1 test in IE.CPU.

* Fix compilation issue.

* Replace std::memset with std::fill.

* Add u4 type support.

* Add i4 support.

* LP types support generalized.

* Remove std::copy optimization.

* Fix backend test for LP types.

* Fixed arm plugin compilation.

* Add LP types to Serialization SLT.

* Add Convert to summarize.py report.
rnugmanx pushed a commit to rnugmanx/openvino that referenced this pull request Aug 26, 2021
* Add initial version of u1 type support.

* Turn off u8_to_u1 test in IE.CPU.

* Fix compilation issue.

* Replace std::memset with std::fill.

* Add u4 type support.

* Add i4 support.

* LP types support generalized.

* Remove std::copy optimization.

* Fix backend test for LP types.

* Fixed arm plugin compilation.

* Add LP types to Serialization SLT.

* Add Convert to summarize.py report.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: Core OpenVINO Core (aka ngraph)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants